DELAYED ENCODING BASED JOINT VIDEO AND STILL IMAGE PIPELINE 
WITH STILL BURST MODE 



Technical Field

The technical field relates to video imaging systems, and, in particular, to joint
video and still image pipelines.

Background

Digital cameras are widely used to acquire high resolution still image
photographs. Digital video cameras are also used to record home videos, television
programs, movies, concerts, or sports events on a magnetic disk or optical DVD for
storage or transmission through communications channels. Some commercial cameras
are able to take both digital video and digital still image photographs. However, most of
these cameras require a user to switch between a video recording mode and a digital still
image mode. Separate pipelines are generally used for each of the video recording and
still image modes. Examples of these cameras include SANYO ID-SHOT® and
CANON POWERSHOT S300®. The SANYO ID-SHOT® uses an optical disk, whereas
the CANON POWERSHOT S300® uses synchronous dynamic random access memory
(SDRAM). However, both cameras are still image cameras that have the capability of
taking video clips, using separate pipelines.

Other cameras use a single software pipeline to acquire both digital video and low
quality still images by taking one of the video frames as is, and storing the particular
video frame as a high resolution still image. Examples of such cameras include JVC
GR-DVL9800®, which is a digital video camera that allows a user to take a picture at a
certain point in time. However, the pictures taken generally are of low quality, because a
low resolution video pipeline is used to generate the high resolution still image pictures.

When still images are acquired in burst mode, current cameras try to process both
pipelines independently. If a single hardware processing pipeline is used, a large frame
buffer may be needed to store video frames while the burst mode still images are
processed. However, a large frame buffer is costly, and the delay that builds up on the
video side may be undesirable.

Other cameras try brute force real time processing, which is costly.

Summary

A method and corresponding apparatus for concurrently processing digital video
frames and high resolution still images in burst mode include acquiring with high priority
video frames and high resolution still images in burst mode from one or more image
sensors, and storing with high priority the video frames and the high resolution still
images in raw format in a memory during acquisition of the high resolution still images in
burst mode. The method and corresponding apparatus further include processing with
low priority the video frames stored in the memory using a video pipeline, and processing
the high resolution still images acquired during the burst mode using a high resolution
still image pipeline. The high resolution still image pipeline runs concurrently with the
video pipeline.



In an embodiment, the video frames and the high resolution still images are
acquired and stored in real time. In another embodiment, the high resolution still images
are filtered and downsampled to be inputted into the video pipeline to make up the
deficiency in video frames. In yet another embodiment, the video frames and the high
resolution still images are processed into a standard format by an image/video
transcoding agent.

Description of the Drawings

The preferred embodiments of the method and corresponding apparatus for
concurrently processing digital video frames and high resolution still images in burst
mode will be described in detail with reference to the following figures, in which like
numerals refer to like elements, and wherein:

Figure 1 illustrates an exemplary operation of an exemplary joint video and still
image pipeline;

Figure 2 illustrates a preferred embodiment of a video camera system using the
exemplary joint video and still image pipeline of Figure 1;

Figure 3 illustrates an exemplary hardware implementation of the exemplary joint
video and still image pipeline of Figure 1;

Figures 4A-4C are flow charts describing in general the exemplary joint video
and still image pipeline of Figure 1;

Figure 5 illustrates an exemplary multithread system for concurrently processing
video frames and high resolution still images in burst mode; and

Figures 6A and 6B illustrate an exemplary memory map to implement the
multithread system of Figure 5.

Detailed Description

A digital video camera system may utilize a joint video and still image pipeline
that simultaneously acquires, processes, transmits and/or stores digital video and high
resolution digital still image photographs. The joint pipeline may include a video pipeline
optimized for digital video frames and a high resolution still image pipeline optimized for
high resolution digital still images. The digital video camera system may also
concurrently acquire and process video frames and high resolution still images in burst
mode using delayed encoding technology. The delayed encoding technology acquires
video frames and burst mode still images in raw format without processing, and stores the
video frames and the high resolution still images acquired during the burst mode into a
memory or storage device. The video frames and the high resolution still images may be
processed with low priority if extra time and processing power are available. The digital
video camera system processes the stored video frames and the stored high resolution still
images acquired during the burst mode after the burst mode or video recording stops.

Figure 1 illustrates an exemplary operation of an exemplary joint video and still 
image pipeline, which is capable of simultaneously capturing digital video frames 120 
and high resolution digital still image frames 110. The video frames 120 may be acquired 
at, for example, 30 frames per second (fps). During video frame acquisition, a snapshot 
102 may be taken to acquire a particular still image frame 110 in high resolution, which is
then processed. During the high resolution still image processing, all incoming video 
frames 120 that are captured during that time may be temporarily stored, i.e., buffered, in 
a frame buffer 330 (shown in Figure 3) before being processed. Both the video frames 
120 and the high resolution still image frame 110 may be stored or transmitted through 
communications channels, such as a network. 

Figure 2 illustrates a preferred embodiment of a video camera system 200 using 
the exemplary joint video and still image pipeline. In this embodiment, a video pipeline 
220 and a high resolution still image pipeline 210 share the same high resolution image
sensor 240. The high resolution image sensor 240, which may be a charge coupled
device (CCD) sensor or a complementary metal oxide semiconductor (CMOS) sensor,
may take high resolution still image frames 110 while acquiring medium resolution video
frames 120. This embodiment is inexpensive because the video camera system 200 uses 
one hardware processing pipeline 300 (shown in Figure 3) with one image sensor 240 and 
one processor 360 (shown in Figure 3). 

The image sensor 240 typically continuously acquires high resolution video 
frames 120 at a rate of, for example, 30 fps. Each of the high resolution video frames 120 
may be converted into a high resolution still image photograph 110. When a user is not 
interested in taking a high resolution still image photograph 110, the only pipeline 
running may be the video pipeline 220, which acquires high resolution video frames 120,
and downsamples the frames to medium resolution (for example, 640x480), then
processes the medium resolution video frames 120. When the user wants to acquire a
high resolution still image frame 110, the image acquired by the high resolution image 
sensor 240 can be used both in the video pipeline 220 as well as in the high resolution still 
image pipeline 210 (described in detail later). 

The video camera system 200 may include a storage device 250 and a connection 
with a communications channel/network 260, such as the Internet or other type of 
computer or telephone networks. The storage device 250 may include a hard disk drive, 
floppy disk drive, CD-ROM drive, or other types of non-volatile data storage, and may
correspond with various databases or other resources. After the video frames 120 and the
high resolution still image frames 110 are acquired, the video frames 120 and the high
resolution still image frames 110 may be stored in the storage device 250 or transmitted
through the communications channel 260. The video camera system 200 may also include
an image/video transcoding agent 270 for encoding the video frames 120 and the high 
resolution still image frames 110 into a standard format, for example, tagged image file 
format (TIFF) or Joint Photographic Experts Group (JPEG). 

Figure 3 illustrates an exemplary hardware implementation of the preferred 
embodiment of the exemplary joint video and still image pipeline. This embodiment 
includes the single hardware processing pipeline 300 supporting two software pipelines. 
A sensor controller 310 may be controlled by a user to retrieve high resolution mosaiced
still image frames 110 at a rate of, for example, one every thirtieth of a second to generate
a video signal. The sensor controller 310 may then store the selected high resolution still
image frames 110 into a memory 320. The memory 320 may include random access
memory (RAM) or similar types of memory. Next, the high resolution still image frames 
110 may be processed using a processor 360, which may be a microprocessor 362, an 
ASIC 364, or a digital signal processor 366. The ASIC 364 performs algorithms quickly, 
but is application specific and only performs a specific algorithm. On the other hand, the 
microprocessor 362 or the digital signal processor 366 may perform many other tasks. 
The processor 360 may execute information stored in the memory 320 or the storage 
device 250, or information received from the Internet or other network 260. The digital 
video and still image data may be copied to various components of the pipeline 300 over 
a data bus 370. 

In the video pipeline 220, the processor 360 may downsample, demosaic, and
color correct the video frames 120. Next, the processor 360 may compress and transmit
the video frames 120 through an input/output (I/O) unit 340. Alternatively, the video
frames 120 may be stored in the storage device 250.

Both pipelines 210, 220 may be executed concurrently, i.e., high resolution still
image photographs 110 may be acquired during video recording. A frame buffer 330 may
store video frames 120 while the processor 360 is processing the high resolution still
image frame 110. The sensor controller 310 may still capture video frames 120 at a rate
of, for example, 30 fps, and store the video frames 120 into the memory 320. The
processor 360 may downsample the video frames 120 and send the downsampled video
frames 120 into the frame buffer 330. The frame buffer 330 may store the downsampled
video frames 120 temporarily without further processing. This may incur some delay in
the video pipeline 220 if the video is directly transmitted through the communications
channel 260. However, this delay may be compensated by a similar buffer on the
receiver end. During video frame buffering, the high resolution still image frame 110
may be processed by the processor 360, using complex algorithms. At the same time, the
video frames 120 may be continuously stored into the memory 320, downsampled, and
sent into the frame buffer 330 to be stored.

Although the video camera system 200 is shown with various components, one
skilled in the art will appreciate that the video camera system 200 can contain additional
or different components. In addition, although the video frames 120 and the still image
frames 110 are described as being stored in memory, one skilled in the art will appreciate
that the video frames 120 and the still image frames 110 can also be stored on or read
from other types of computer program products or computer-readable media, such as
secondary storage devices, including hard disks, floppy disks, or CD-ROM; a carrier
wave from the Internet or other network; or other forms of RAM or ROM. The
computer-readable media may include instructions for controlling the video camera
system 200 to perform a particular method.

Figures 4A-4C are flow charts describing in general the exemplary joint video
and still image pipeline. Referring to Figure 4A, operation of the video pipeline 220,
shown on the left, typically results in continuous processing of video frames 120.
Operation of the high resolution still image pipeline 210, shown on the right, typically
results in processing a high resolution still image frame 110 every time the user wants to
acquire a high resolution photograph.

After raw pixel video data of video frames 120 are acquired, for example, at
1024x1008 and 30 fps (block 400), the video frames 120 may be downsampled and
demosaiced in order to save memory space (block 410). Then, the frame buffer 330 may
buffer the video frames 120 while the high resolution still image frame 110 is being
acquired, processed, stored, and/or transmitted (block 420). Alternatively, demosaicing
may be performed after the video frames 120 are buffered. Thereafter, the video pipeline
220 may start emptying the frame buffer 330 as fast as possible, and performing color
correction, compression, storage and/or transmission (blocks 430, 440, 450). Once the
frame buffer 330 is emptied, another high resolution still image frame 110 may be
acquired.
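
The buffering and emptying of the frame buffer 330 described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the implementation of the described system: the function name simulate_frame_buffer, the still processing time of ten frame periods, and the drain rate of two frames per frame period are hypothetical values chosen for the example.

```python
from collections import deque

def simulate_frame_buffer(total_frames, still_start, still_duration, drain_per_tick=2):
    """Simulate video frames queuing in a frame buffer while a still image is processed.

    One tick equals one video frame period (1/30 s at 30 fps).
    Returns the peak number of buffered frames and the order in which frames left the buffer.
    """
    buffer = deque()
    processed = []
    peak = 0
    tick = 0
    while tick < total_frames or buffer:
        if tick < total_frames:
            buffer.append(tick)              # a downsampled frame enters the buffer
        peak = max(peak, len(buffer))
        busy_with_still = still_start <= tick < still_start + still_duration
        if not busy_with_still:
            # The processor is free, so the buffer is emptied as fast as possible.
            for _ in range(min(drain_per_tick, len(buffer))):
                processed.append(buffer.popleft())
        tick += 1
    return peak, processed

# Hypothetical scenario: 30 frames, still image processing occupies ticks 5 to 14.
peak, order = simulate_frame_buffer(total_frames=30, still_start=5, still_duration=10)
```

In this sketch the peak occupancy grows with the time spent on the still image, which is why a longer burst calls for a larger frame buffer.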

For high resolution still image frames 110, sophisticated demosaicing may be 
performed (block 412), followed by high quality color correction (block 432). The high 
resolution still image frames 110 may optionally be compressed (block 442), and then 
stored and/or transmitted through similar communications channels 260 (block 452). 

Figure 4B illustrates in detail the operation of the high resolution still image 
pipeline 210. The sophisticated demosaicing process (block 412) utilizes a high quality 
demosaicing algorithm that generates a high quality color image from the originally 
mosaiced image acquired by the image sensor 240. The demosaicing process is a time 
consuming filtering operation, which may gamma-correct the input if the image sensor 
240 has not done so, resulting in excellent color image quality with almost no 
demosaicing artifacts. For example, demosaicing for high resolution still image frames 
110 may filter the original image with a 10x10 linear filter. The demosaicing algorithm 
takes into account the lens used for acquisition, as well as the spectral sensitivity of each 
of the color filters on the mosaic. 

Once the high resolution still image frame 110 is demosaiced, the high resolution
still image frame 110 may be color corrected depending on the illumination present at the
time of the capture (block 432). Complex transformation matrices may be used to restore
accurate color to the high resolution still image frames 110, in order to generate an
excellent photograph. The color correction algorithms may be similar to the algorithm
used in the HP-PHOTOSMART 618®.
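
As a minimal sketch of color correction by a transformation matrix, the following applies a 3x3 matrix to one RGB pixel. The matrix values are hypothetical and are not taken from any camera; the description does not specify the matrices actually used, and the function name color_correct is an assumption made for illustration.

```python
def color_correct(pixel, matrix):
    """Apply a 3x3 color transformation matrix to one (R, G, B) pixel, clamping to 0..255."""
    corrected = []
    for row in matrix:
        value = sum(coeff * channel for coeff, channel in zip(row, pixel))
        corrected.append(max(0, min(255, round(value))))
    return tuple(corrected)

# Hypothetical matrix; each row sums to 1.0 so that neutral grays are preserved.
EXAMPLE_MATRIX = [
    [1.5, -0.3, -0.2],
    [-0.2, 1.4, -0.2],
    [-0.1, -0.4, 1.5],
]

gray = color_correct((128, 128, 128), EXAMPLE_MATRIX)
```

In practice a different matrix would presumably be selected for each illumination condition; that selection step is outside this sketch.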

Figure 4C illustrates in detail the operation of the video pipeline 220. A high
quality video pipeline 220 may demand a large amount of processing power for
computation. Because the video processing needs to be achieved at, for example, 30 fps,
the downsampling should be fast. In addition, lower resolution video frames 120 (for
example, 640x480 pixels) demand much lower quality demosaicing (block 410), because
the human visual system may not notice certain artifacts at high video frame rates. For
example, demosaicing for video frames 120 may filter the original image with a 4x4
linear filter. Similarly, color correction may be simpler because high quality is not
needed on the video side (block 430).
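
To give a rough sense of the difference in workload between the two demosaicing filters, the following arithmetic sketch counts multiply-accumulate operations. It assumes one dense filter application per pixel and ignores color planes, border handling, and any optimization, so the figures are only order-of-magnitude estimates; the function name macs_per_second is hypothetical.

```python
def macs_per_second(width, height, kernel_size, frames_per_second):
    """Multiply-accumulate count per second for a dense kernel_size x kernel_size linear filter."""
    return width * height * kernel_size * kernel_size * frames_per_second

# 4x4 filter on 640x480 video frames at 30 fps.
video_cost = macs_per_second(640, 480, 4, 30)

# 10x10 filter on a single 1024x1008 still image frame.
still_cost_per_frame = macs_per_second(1024, 1008, 10, 1)

# Per-pixel cost ratio between the still filter and the video filter.
per_pixel_ratio = (10 * 10) / (4 * 4)
```

Under these assumptions a single still frame costs roughly as much filtering work as twenty video frames, which is consistent with buffering or deferring video work while a still is processed.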

When a user acquires high resolution still images in burst mode, the digital video
camera system 200 uses delayed encoding technology to acquire and store video frames
and burst mode high resolution still images in raw format into the memory 320 or the
storage device 250.

The frame buffer 330 may be used for loss-less compression of the raw high
resolution still image frames 110 and intermediate processing of the video frames 120
until one of the high resolution still image frames 110 is acquired. The length of the burst
mode and the amount of processing power define the size of the frame buffer 330, which
is preferably kept to a minimum due to cost. The high resolution still image frames 110
may be used to reset the Moving Picture Experts Group (MPEG) encoding process as
intraframes (I-frames). I-frames are stand-alone compressed frames, i.e., frames that are
compressed without depending on information from previous or future frames.
Accordingly, all compression algorithms may start with an I-frame, and all other frames
may be compressed based on the I-frame.
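
The idea of resetting the encoder at each still image can be sketched by labeling frames. This is a simplified illustration only: the label "P" for every dependent frame is an assumption (a real MPEG encoder may also use other frame types), and the function name frame_types is hypothetical.

```python
def frame_types(total_frames, still_positions):
    """Label each frame 'I' where a high resolution still was taken, otherwise 'P'.

    The first frame is always labeled 'I', since a sequence starts with an I-frame.
    """
    stills = set(still_positions)
    return ["I" if i == 0 or i in stills else "P" for i in range(total_frames)]

# Hypothetical 12-frame sequence with a still image taken at frame index 6.
labels = frame_types(12, [6])
```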

After one of the I-frames is acquired, the processor 360 stores the video frames
120 and high resolution still image frames 110 in raw format without any processing into
the memory 320 or the storage device 250. If extra time and processing power are
available, some stored video frames 120 and high resolution still image frames 110 may
be processed. After the user stops video recording or acquiring high resolution still
images in burst mode, the processor 360 starts processing the video frames 120 and the
high resolution still image frames 110 in parallel.

During the burst mode still image acquisition, a multithread system may be
employed. Figure 5 illustrates an exemplary multithread system for concurrently
processing video frames and high resolution still images in burst mode with different
levels of priority.

Referring to Figure 5, block 510 represents real time acquisition and storage of
raw high resolution still image frames 110 at, for example, B fps. If the video frames are
sampled at, for example, 30 fps, and B=3, the burst mode represents acquiring one high
resolution still image every ten video frames. The high resolution still image frames 110
are typically stored in the memory 320 or the storage device 250. This process has high
priority. Some loss-less compression may be conducted so that less storage is needed.

HP 10017904-1

Block 520 represents real time acquisition, downsampling, and storage of video
frames 120 at, for example, (30-B) fps. The high resolution still image frames 110, for
example, B frames each second, are also inputted into the video processing pipeline 220.
During processing, the high resolution still image frames 110 may be filtered and
downsampled to generate lower resolution video frames to be inputted into the video
processing pipeline 220 to make up the deficiency. For example, if video frames are
sampled at 30 fps, and high resolution still image frames are acquired at 3 fps, then one
out of ten frames is sent to the high resolution still image pipeline 210. The frames are
later downsampled and inputted into the video pipeline 220. Alternatively, the filtering
and downsampling process may be performed in block 530 (described later). The video
frames 120 are also stored in raw format in the memory 320 or the storage device 250.
This process also has high priority.
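
The split of sensor frames between processes 510 and 520, and the filtering and downsampling that makes up the deficiency, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, B is assumed to divide the video frame rate evenly, and a simple box average stands in for whatever filter is actually used. The frame is treated as a single-channel array, which ignores the mosaic pattern of raw sensor data.

```python
def split_frames(video_fps=30, burst_fps=3):
    """Split one second of sensor frames between the still path and the video path.

    Returns (still_indices, video_indices).
    """
    if video_fps % burst_fps != 0:
        raise ValueError("this sketch assumes burst_fps divides video_fps")
    interval = video_fps // burst_fps
    still_indices = [i for i in range(video_fps) if i % interval == 0]
    video_indices = [i for i in range(video_fps) if i % interval != 0]
    return still_indices, video_indices

def box_downsample(frame, factor):
    """Average each factor x factor block (a crude low-pass filter), keeping one sample per block."""
    rows, cols = len(frame), len(frame[0])
    return [
        [
            sum(frame[r + dr][c + dc] for dr in range(factor) for dc in range(factor))
            / (factor * factor)
            for c in range(0, cols - factor + 1, factor)
        ]
        for r in range(0, rows - factor + 1, factor)
    ]

stills, videos = split_frames(30, 3)
small = box_downsample([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]], 2)
```

With B=3, three frames per second go to the still path and 27 to the video path; downsampling the three stills restores a full 30 frames per second for the video sequence.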

In block 530, the low priority video processing pipeline 220 processes and
compresses buffered video frames 120 and the video frames 120 stored during process
520. Therefore, while processes 510 and 520 have high priority, any extra time and
processing power may be used to process and compress the stored video frames 120.

In block 540, the low priority still image processing pipeline 210 processes and
compresses each of the raw high resolution still image frames 110. Whenever extra time
and processing power are available, the processors 360 may process a small number of
high resolution still image frames 110.

Processes 530 and 540 remain active with low priority until all the video frames
120 and the high resolution still image frames 110 stored in processes 510 and 520 have
been successfully encoded and stored. Therefore, the overall data is stored in real time,
and low priority processes process the data in the background with non-real time
processing, so as to reduce computational burden. Processes 510, 520, 530 and 540 may
be implemented independently with the one or more processors 360.
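
The interplay between the high priority acquisition processes and the low priority encoding processes can be sketched as a single-threaded simulation. It is not a real multithreaded implementation, and every number in it is hypothetical: each tick offers a fixed processing budget, acquisition is served first while the burst is active, and whatever is left over accumulates toward encoding one stored raw frame.

```python
def run_scheduler(ticks, burst_ticks, budget=10, acquire_cost=9, encode_cost=5):
    """Simulate high priority acquisition and low priority background encoding.

    Returns the number of stored raw frames still waiting to be encoded after each tick.
    """
    backlog = 0   # raw frames stored but not yet encoded
    credit = 0    # leftover processing units available to low priority work
    history = []
    for tick in range(ticks):
        available = budget
        if tick < burst_ticks:
            # High priority (processes 510/520): acquire and store one raw frame.
            available -= acquire_cost
            backlog += 1
        credit += available
        # Low priority (processes 530/540): encode while leftover budget allows.
        while backlog > 0 and credit >= encode_cost:
            credit -= encode_cost
            backlog -= 1
        if backlog == 0:
            credit = 0   # idle time cannot be saved up
        history.append(backlog)
    return history

# Hypothetical run: a 5-tick burst followed by 5 ticks with no acquisition.
backlog_over_time = run_scheduler(ticks=10, burst_ticks=5)
```

In this sketch the backlog grows during the burst and is cleared quickly once acquisition stops, because the low priority work then receives the whole budget.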

For example, 90% of time may be spent on processes 510 and 520, and 10% of
time on processes 530 and 540. When the user stops the burst mode or video recording,
the low priority processes 530 and 540 gain a higher share of the total processing power.
In the above example, if burst mode is stopped, process 520 is processed at 30 fps, as
opposed to (30-B) fps, because no more high resolution still image frames 110 are
acquired.

If memory space is available, the video camera system 200 continues to compress
video frames 120 and still image frames 110 in the memory 320 or the storage device
250. However, if the memory 320 or the storage device 250 is filled up with no extra
space to process and compress new video frames 120 and burst mode high resolution still
image frames 110, a flag may be used to signal that image and/or video acquisition needs
to stop. Processes 530 and 540 may take advantage of the internal memory 320 and 
frame buffer 330 to continue processing and compressing the buffered video frames 120 
and the raw still images 110, thus freeing up some storage space for more image and/or 
video acquisition. If this is not achieved, then the video frames 120 and the high 
resolution still image frames 110 may be encoded at transmission/download time with the 
image/video transcoding agent 270. In other words, if the video frames or the still image 
frames are not fully encoded due to lack of memory space, the video frames and the still
image frames can be encoded fully at download time by the image/video transcoding
agent 270. 
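
The flag that signals when acquisition needs to stop can be sketched with a fixed-capacity store. The class name FrameStore, the storage unit sizes, and the rule for clearing the flag are all assumptions made for illustration; the description does not specify them.

```python
RAW_SIZE = 4      # hypothetical storage units per raw frame
ENCODED_SIZE = 1  # hypothetical storage units per encoded frame

class FrameStore:
    """Fixed-capacity store for raw and encoded frames with a stop-acquisition flag."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.raw = 0
        self.encoded = 0
        self.stop_flag = False

    def used(self):
        return self.raw * RAW_SIZE + self.encoded * ENCODED_SIZE

    def acquire(self):
        """Store one raw frame, or raise the stop flag if it would not fit."""
        if self.used() + RAW_SIZE > self.capacity:
            self.stop_flag = True
            return False
        self.raw += 1
        return True

    def encode_one(self):
        """Compress one stored raw frame, freeing space; clear the flag once a raw frame fits again."""
        if self.raw == 0:
            return False
        self.raw -= 1
        self.encoded += 1
        if self.used() + RAW_SIZE <= self.capacity:
            self.stop_flag = False
        return True

store = FrameStore(capacity=8)
```

For example, with a capacity of 8 units, two raw frames fill the store and a third acquisition raises the flag; encoding both stored frames frees enough space for acquisition to resume.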

Within the video camera system 200, the video frames 120 and the high resolution 
still image frame 110 may be kept in a nonstandard proprietary format. The image/video
transcoding agent 270, which typically runs on the video camera system 200, detects
when a video frame 120 or a high resolution still image frame 110 is to be downloaded
and transcodes the proprietary loss-less (or near loss-less) raw video frame 120 or high
resolution still image frame 110 into a processed video or image, which is then packed
into a standard compression format, for example, TIFF or JPEG. Alternatively, the 
image/video transcoding agent 270 may run on a docking station or on the host personal 
computer (PC). 

Figures 6A and 6B illustrate an exemplary memory map 600 to implement the
multithread system of Figure 5. Referring to Figure 6A, a compressed video bitstream 620
appears at the top of the memory map 600. When the user starts the burst mode, a marker
640 is placed in the video bitstream 620, signaling that little processing or compressing
occurs from that point in time. High priority processes 510, 520 perform real time
acquisition and storage of video frames 120 and high resolution still image frames 110 in
raw format during the burst mode. After the burst mode or video recording stops, or if 
extra time and processing power exist, low priority processes 530, 540 take over and 
resume processing. 

Video frames 120 and high resolution still image frames 110 acquired during the
burst mode are stored in raw format at the bottom of the memory map 600. For example,
S1, S7, S13 are high resolution still image frames #1, #7, and #13, whereas V2, V3, V4, V5,
V6, V8, V9, V10, V11, V12, V14-V18 are video frames #2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 14-18.
In other words, in this example, three burst mode high resolution still image frames S1,
S7, S13 are generated from 18 frames. After the high resolution still image frames 110 are
acquired, or if extra time and processing power exist, low priority processes 530, 540 start
processing the raw data, i.e., still image frames S1, S7, S13 and the rest of the video
frames. The low priority processes also combine the video frames 120 with filtered and
downsampled versions of the high resolution still image frames 110 in order to generate a
continuous compressed video sequence 120. For example, the processors 360
downsample S1 into V1, S7 into V7, and S13 into V13, so that a continuous video sequence
is generated, from V1 to V18.
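
The way the continuous sequence V1 to V18 is assembled from stored video frames and downsampled stills can be sketched as a simple bookkeeping step. The function name build_video_sequence is hypothetical, and the sketch only tracks which stored frame each output frame comes from; it does not perform any image processing.

```python
def build_video_sequence(total_frames, still_numbers):
    """Return (output_frame, source_frame) pairs for the continuous sequence V1..Vn.

    Frames whose number is in still_numbers were captured as stills (S) and are
    filled in from a filtered, downsampled copy; all others come from stored raw video.
    """
    stills = set(still_numbers)
    sequence = []
    for n in range(1, total_frames + 1):
        source = f"S{n}" if n in stills else f"V{n}"
        sequence.append((f"V{n}", source))
    return sequence

sequence = build_video_sequence(18, [1, 7, 13])
```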

Referring to Figure 6B, video before the burst mode 621 is stored before the marker
640, whereas video after the burst mode 622 is stored after the marker 640. The marker
640 points to the video sequence acquired during the burst mode 635, followed by another
marker 645 pointing back to the video after the burst mode 622. Therefore, no
discontinuity exists in the video sequence 120. The high resolution still image frames
S1, S7, S13 are processed and placed in the memory map 600 separately from the video
sequence acquired during the burst mode 635. This linking mechanism in the memory
map 600 is similar to that of a computer file system.
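
The linking mechanism can be sketched as a chain of segments, each naming the segment that follows it, much like linked blocks in a file system. The dictionary layout, the segment names, and the frame labels below are hypothetical and serve only to illustrate how markers 640 and 645 keep the video sequence continuous.

```python
def read_video(segments, start):
    """Follow markers through the memory map and return the frames in playback order.

    Each segment is a (frames, next_segment_name_or_None) pair.
    """
    frames = []
    visited = set()
    name = start
    while name is not None:
        if name in visited:
            raise ValueError("marker loop detected")
        visited.add(name)
        segment_frames, next_name = segments[name]
        frames.extend(segment_frames)
        name = next_name
    return frames

memory_map = {
    # The link out of this segment plays the role of marker 640.
    "video_before_burst_621": (["Va", "Vb"], "burst_video_635"),
    # The link out of this segment plays the role of marker 645.
    "burst_video_635": (["V1", "V2", "V3"], "video_after_burst_622"),
    "video_after_burst_622": (["Vy", "Vz"], None),
}

playback = read_video(memory_map, "video_before_burst_621")
```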

While the method and apparatus for concurrently processing digital video frames
and high resolution still images in burst mode have been described in connection with an
exemplary embodiment, those skilled in the art will understand that many modifications
in light of these teachings are possible, and this application is intended to cover any
variations thereof.